Building an Arabic Multiword Expressions Repository
نویسندگان
چکیده
We introduce a list of Arabic multiword expressions (MWE) collected from various dictionaries. The MWEs are grouped based on their syntactic type. Every constituent word in the expressions is manually annotated with its full context-sensitive morphological analysis. Some of the expressions contain semantic variables as place holders for words that play the same semantic role. In addition, we have automatically annotated a large corpus of Arabic text using a pattern-matching algorithm that considers some morphosyntactic features as expressed by a highly inflected language, such as Arabic. A sample part of the corpus is manually evaluated and the results are reported in this paper.
منابع مشابه
Building an Arabic Multiword Expressions RepositoryBuilding an Arabic Multiword Expressions RepositoryBuilding an Arabic Multiword Expressions RepositoryBuilding an Arabic Multiword Expressions RepositoryBulding an Arabic Multiword Expressions Repository
We introduce a list of Arabic multiword expressions (MWE) collected from various dictionaries. The MWEs are grouped based on their syntactic type. Every constituent word in the expressions is manually annotated with its full context-sensitive morphological analysis. Some of the expressions contain semantic variables as place holders for words that play the same semantic role. In addition, we ha...
متن کاملUnsupervised Construction of a Lexicon and a Repository of Variation Patterns for Arabic Modal Multiword Expressions
We present an unsupervised approach to build a lexicon of Arabic Modal Multiword Expressions (AM-MWEs) and a repository of their variation patterns. These novel resources are likely to boost the automatic identification and extraction of AM-MWEs.
متن کاملA Framework for the Classification and Annotation of Multiword Expressions in Dialectal Arabic
In this paper we describe a framework for classifying and annotating Egyptian Arabic Multiword Expressions (EMWE) in a specialized computational lexical resource. The framework intends to encompass comprehensive linguistic information for each MWE including: a. phonological and orthographic information; b. POS tags; c. structural information for the phrase structure of the expression; d. lexico...
متن کاملMultiword expressions: linguistic precision and reusability
This paper discusses the approach to multiword expressions being adopted in the LinGO English Resource Grammar (http://lingo.stanford.edu), a broad-scale bidirectional grammar of English in the HPSG framework. We discuss how the lexicon of multiword expressions is encoded in a database and describe the implications for building a reusable lexical resource.
متن کاملIntroduction to the special issue on multiword expressions: Having a crack at a hard nut
Multiword expressions are an integral part of language. Their heterogeneous characteristics have proved a challenge to both linguistic and computational analysis. Their importance to language technology has long been recognised. In this special issue we include ten papers which propose a variety of approaches for finding and handling these expressions, both for building general purpose lexical ...
متن کامل